Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #36833 - Add SecureBoot support for arbitrary operating systems to "Grub2 UEFI" PXE loaders #877

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

goarsna
Copy link

@goarsna goarsna commented Oct 16, 2023

This feature consists of two patches, one for foreman and one for smart-proxy.

This patch introduces a new loader of kind :PXEGrub2TargetOS which allows to provide host-specific Network Bootstrap Programs (NPB) in order to enable network based installations for SecureBoot-enabled hosts.

SecureBoot expects to follow a chain of trust from the start of the host to the loading of Linux kernel modules. The very first shim that is loaded basically determines which distribution is allowed to be booted or kexec'ed until next reboot.

The existing "Grub2 UEFI SecureBoot" is not sufficiant as it limits the possible installations to the vendor of the Foreman (Smart Proxy) host system.

Providing shim and GRUB2 by the vendor of the to-be-installed operating system allows Foreman to install any operating system on SecureBoot-enabled hosts over network.

To achieve this, the host's DHCP filename option is set to a shim path in a directory that is host-specific (contains MAC address). Corresponding shim and GRUB2 binaries are copied into that directory along with the generated GRUB2 configuration files as we know from "Grub2 UEFI".

The required binaries must be provided once in the so called "bootloader universe". This directory can be configured via the settings file /etc/foreman-proxy/settings.d/tftp.yml and defaults to /usr/local/share/bootloader-universe/<os>/. These binaries can be manually retrieved from the installation media and is not part of this patchset.

Full example:

[root@vm ~]# hammer host info --id 241 | grep -E "(MAC address|Operating System)"
    MAC address:  00:50:56:b4:75:5e
    Operating System:       Ubuntu 22.04 LTS

[root@vm ~]# tree /usr/local/share/bootloader-universe/ /usr/local/share/bootloader-universe/
|-- centos
|   |-- grubx64.efi
|   `-- shimx64.efi
`-- ubuntu
    |-- grubx64.efi
    `-- shimx64.efi

[root@vm ~]# hammer host update --id 241 --build true

[root@vm ~]# tree /var/lib/tftpboot/grub2/00-50-56-b4-75-5e/ /var/lib/tftpboot/grub2/00-50-56-b4-75-5e/
|-- grub.cfg
|-- grub.cfg-00:50:56:b4:75:5e
|-- grub.cfg-01-00-50-56-b4-75-5e
|-- grubx64.efi
|-- shimx64.efi
`-- targetos

[root@vm ~]# grep -B2 00-50-56-b4-75-5e /var/lib/dhcpd/dhcpd.leases
  hardware ethernet 00:50:56:b4:75:5e;
  fixed-address 192.168.145.84;
        supersede server.filename = "grub2/00-50-56-b4-75-5e/shimx64.efi";

[root@vm ~]# pesign -S -i /var/lib/tftpboot/grub2/00-50-56-b4-75-5e/grubx64.efi | grep Canonical The signer's common name is Canonical Ltd. Secure Boot Signing (2021 v1)

@theforeman-bot
Copy link
Member

Can one of the admins verify this patch?

@goarsna
Copy link
Author

goarsna commented Oct 17, 2023

I have opened a PR to document this feature in foreman-documentation:
theforeman/foreman-documentation#2145

@goarsna
Copy link
Author

goarsna commented Oct 20, 2023

This PR also requires the changes from Foreman PR #9864

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd advertise this as an optional capability. In https://theforeman.org/2019/04/smart-proxy-capabilities-explained.html I've written about this. The benefit is two fold. On the one hand, it can disabled if bootloader_universe is not set. Foreman can detect this. It can also help with n-1 support where Foreman x.y runs with a Foreman Proxy x.(y-1).

One example where you can see it in Foreman is https://github.com/theforeman/foreman/blob/66f0c87b26c9b3a125bd216eb26c9a28b06effb3/app/models/concerns/orchestration/dhcp.rb#L84-L86 but Katello also uses it for the various content types.

modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/tftp_api.rb Outdated Show resolved Hide resolved
@goarsna
Copy link
Author

goarsna commented Jan 25, 2024

I encountered an error on a standalone Smart Proxy with the check if the bootloader directory is configured and fixed it.

@goarsna
Copy link
Author

goarsna commented Feb 15, 2024

Hi there,
I have rethought the whole concept of how we wanted to add SecureBoot support for arbitrary operating systems due to the discussions in Foreman Community Forum and came up with an idea how to completely integrate what we want into the existing PXEGrub2 PXE loaders.

What's left to do for now:

  • I will have to take a look at the tests after my changes.
  • As this approach generates MAC specific directories directly inside /var/lib/tftpboot/ the ownership of this directory has to be changed to user foreman-proxy.
  • One open request by @lzap was to provide differentiated bootloader files in the bootloader_universe not only by OS vendor but also by OS version.

@sbernhard @ekohl What do you think? Please share your thoughts. I'm looking forward to input and discussion.

@goarsna
Copy link
Author

goarsna commented Feb 28, 2024

Implemented changes as discussed in the course of the Foreman Community Forum thread.

Please let me know, if anything is missing or further changes are required.

@goarsna
Copy link
Author

goarsna commented Feb 29, 2024

Added deletion of host specific bootloader files in case host leaves build mode.

@sbernhard
Copy link
Contributor

[test smart-proxy]

@ekohl
Copy link
Member

ekohl commented Mar 6, 2024

ok to test

goarsna added a commit to ATIX-AG/puppet-foreman_proxy that referenced this pull request Mar 10, 2024
… to "Grub2 UEFI" PXE loaders

In the course of theforeman/foreman#9864 and theforeman/smart-proxy#877,
SecureBoot support for arbitrary operating systems has been added to the
"Grub2 UEFI" PXE loaders.
This patch adds a new parameter 'bootloader_universe' to the TFTP
configuration and a directory 'host_config' inside the TFTP root
directory, that are both required by the aforementioned PRs.
@goarsna
Copy link
Author

goarsna commented Mar 10, 2024

Squashed and changed commit message to reflect current implementation. => Any objections against merging this? What do you think, @nofaralfasi and @stejskalleos?

@goarsna goarsna changed the title Fixes #36833 - New PXE loader "Grub2 UEFI SecureBoot (target OS)" Fixes #36833 - Add SecureBoot support for arbitrary operating systems to "Grub2 UEFI" PXE loaders Mar 11, 2024
@goarsna
Copy link
Author

goarsna commented Aug 27, 2024

Hey Nofar, I finally made it to address your comments and to test the changes. I ran tests on AlmaLinux 8 and Rocky Linux 8 with the PXELinux BIOS, Grub2 UEFI and Grub2 UEFI SecureBoot PXE loaders with and without configured Bootloader Universe.

@goarsna
Copy link
Author

goarsna commented Sep 5, 2024

Hey Nofar, as discussed I had a look at the method length of setup_bootloader. I was able to reduce it to 20 lines by splitting out the parts for creating the symlinks. I tested my changes again on AlmaLinux 8, Rocky Linux 8 and Ubuntu 22.04 with the PXELinux BIOS, Grub2 UEFI and Grub2 UEFI SecureBoot PXE loaders with and without configured Bootloader Universe.

goarsna added a commit to ATIX-AG/foreman-documentation that referenced this pull request Sep 9, 2024
goarsna added a commit to ATIX-AG/foreman-documentation that referenced this pull request Sep 9, 2024
goarsna added a commit to ATIX-AG/foreman-documentation that referenced this pull request Sep 12, 2024
Copy link

@nofaralfasi nofaralfasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@goarsna Thank you for your efforts on this PR. The code is shaping up well, and I believe we're very close to finalizing it.

modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
Comment on lines 177 to 181
if bootloader_path
logger.debug "TFTP: Creating symlinks from bootloader universe."
create_bootloader_universe_symlinks(bootloader_path, pxeconfig_dir_mac)
else
logger.debug "TFTP: Creating symlinks from default bootloader files."
create_default_symlinks(bootfilename_efi, pxeconfig_dir_mac)
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the terms bootloader universe or default bootloader files might not be clear to users reading these logs. It would be better to either remove these messages or include the actual directory paths for better context.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added these debug messages to have a short one liner to search for before the created symlinks are printed. The next four lines in the log should show the actual symlinks that are created. In my tests, for example for an AlmaLinux host the log states:

2024-09-18T14:24:59 fec247f7 [D] TFTP: Creating symlinks from bootloader universe.
2024-09-18T14:24:59 fec247f7 [D] TFTP: Creating relative symlink: /var/lib/tftpboot/host-config/00-11-22-33-44-55/grub2/grubx64.efi -> ../../../bootloader-universe/pxegrub2/almalinux/default/x86_64/grubx64.efi
2024-09-18T14:24:59 fec247f7 [D] TFTP: Creating relative symlink: /var/lib/tftpboot/host-config/00-11-22-33-44-55/grub2/shimx64.efi -> ../../../bootloader-universe/pxegrub2/almalinux/default/x86_64/shimx64.efi
2024-09-18T14:24:59 fec247f7 [D] TFTP: Creating relative symlink: /var/lib/tftpboot/host-config/00-11-22-33-44-55/grub2/boot.efi -> ../../../bootloader-universe/pxegrub2/almalinux/default/x86_64/boot.efi
2024-09-18T14:24:59 fec247f7 [D] TFTP: Creating relative symlink: /var/lib/tftpboot/host-config/00-11-22-33-44-55/grub2/boot-sb.efi -> ../../../bootloader-universe/pxegrub2/almalinux/default/x86_64/boot-sb.efi

=> Your second suggestion is already implemented. :) Would it then be ok to leave the one-liner as it is?

modules/tftp/tftp_plugin.rb Show resolved Hide resolved
Copy link

@nofaralfasi nofaralfasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two more notes from my side:

  1. Since we are no longer sending the build parameter in the POST request when initializing the TFTP service, we follow the same procedure when the host enters build mode and when it exits build mode.
    IMO, there's no need to delete the existing symlinks and create new ones when the action performed is to disable build mode for a host.

  2. We currently lack tests for the new workflow. It would be great to have at least some tests for the symlink generation.

modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
modules/tftp/server.rb Outdated Show resolved Hide resolved
logger.debug "TFTP: Deploying host specific bootloader files to '#{pxeconfig_dir_mac}'."

FileUtils.mkdir_p(pxeconfig_dir_mac)
FileUtils.rm_f(Dir.glob("#{pxeconfig_dir_mac}/*.efi"))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we delete these binaries here if we are already using force: true in FileUtils.ln_s, which overwrites existing symlinks?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this to guarantee that we have a clean directory state on each rebuild of a host in case the files required in the host_config directory change in future.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only applies if the OS changes, as the bootloader names remain the same if the OS is unchanged, so there's no issue in that case. My goal is to minimize the risk of accidentally deleting necessary files (e.g., the binaries in /var/lib/tftpboot/grub2). However, since we validate the mac address beforehand (see here), I believe we’re safe in this regard.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took another and again deeper look at what happens regarding MAC address validation when the setup_bootloader method is called and yes: The way setup_bootloader is called guarantees that the FileUtils.rm_f command can and will only delete *.efi files in the :tftproot:/host-config/<MAC_address>/ directory.

Although I have to admit that the cases I am thinking about here might be really rare corner cases [1], I would prefer to stick to deleting the present *.efi files on updating the symlinks.

[1] An example would be when an OS vendor decides to follow another naming scheme for the shim and GRUB2 binaries.

modules/tftp/server.rb Outdated Show resolved Hide resolved
… to "Grub2 UEFI" PXE loaders

This feature consists of four patches, one each for foreman,
smart-proxy, foreman-installer, and puppet-foreman_proxy.

This patch adds support for individual Network Bootstrap Programs (NBP)
in order to enable network based installations of SecureBoot enabled
hosts for arbitrary operating systems.

SecureBoot expects to follow a chain of trust from the initial boot of
the host to the loading of Linux kernel modules. The very first shim
that is loaded determines which distribution is allowed to be booted or
kexec'ed until next reboot.

Currently the "Grub2 UEFI SecureBoot" PXE loaders use NBPs provided by
the vendor of the Foreman/Smart Proxy host system. All hosts receive and
execute the same binary. On SecureBoot enabled hosts, this limits
installations to operating systems by the vendor of the Foreman/
Smart Proxy host system.

Providing shim and GRUB2 by the vendor of the operating system to be
installed allows Foreman to install any operating system on SecureBoot
enabled hosts over network.

To achieve this, the host's DHCP filename option is set to a shim/GRUB2
binary in a host specific directory based on their MAC address.
Corresponding shim and GRUB2 binaries are copied into that directory
along with the generated GRUB2 configuration files.
When provisioning a host, the Smart Proxy checks in a dedicated
directory inside the TFTP root - the so called "bootloader universe" -
if NBPs are present matching the operating system, operating system
version, and architecture of the host to be installed. If this is the
case, these NBPs are copied from the bootloader universe directory to
the host specific directory. If not, as a fallback the default NBPs
provided by the vendor of the Foreman/Smart Proxy host system are
copied from the `:tftproot:/grub2` directory to the host specific
directory.

Up to now, shim and GRUB2 binaries have to be retrieved and set up in
the bootloader universe directory manually according to the
documentation. An automatic way to provide OS dependent NBPs will be
added in future.

In case there are no NBPs present in the bootloader universe matching
the operating system, operating system version, and architecture of
the host to be installed, the behaviour of the "Grub2 UEFI" PXE
loaders does not change to the behavior prior to this feature.

Implementation notes:
---------------------
* To be future proof (e.g. to be able to provide NBPs in the bootloader
  universe for other PXE loaders without running into any filename
  conflicts) and for better structure, the PXE kind is prepended as a
  first directory level inside the bootloader universe.
* The operating system version inside the bootloader universe consists
  of the major and minor version (if applicable) of the operating system
  separated by a dot (`.`). If no NBPs are configured for a specific
  operating system version the fallback directory `default` is used.
* To simplify things on Foreman side in future, symlinks are used for
  the shim (boot-sb.efi) and GRUB2 (boot.efi) binaries.
* Inside the TFTP root directory a new directory `host-config` is
  created for storing all the host specific directories.
* Inside the TFTP root directory a new directory `bootloader-universe`
  is created for storing all the OS specific boot files.
* For storage efficiency the shim and GRUB2 binaries from the
  bootloader universe or the `:tftproot:/grub2` directory are
  symlinked to the host specific directory.

Full example:
-------------
[root@vm ~]# hammer host info --id 241 | grep -E "(MAC address|Operating System)"
MAC address: 00:50:56:b4:75:5e
Operating System: AlmaLinux 8.9

[root@vm ~]# tree /var/lib/tftpboot/bootloader-universe/
/var/lib/tftpboot/bootloader-universe/
└── pxegrub2
    └── almalinux
        ├── 8.9
        │   └── x86_64
        │       ├── boot.efi -> grubx64.efi
        │       ├── boot-sb.efi -> shimx64.efi
        │       ├── grubx64.efi
        │       └── shimx64.efi
        └── default
            └── x86_64
                ├── boot.efi -> grubx64.efi
                ├── boot-sb.efi -> shimx64.efi
                ├── grubx64.efi
                └── shimx64.efi

[root@vm ~]# hammer host update --id 241 --build true

[root@vm ~]# tree /var/lib/tftpboot/host-config
/var/lib/tftpboot/host-config
└── 00-50-56-a3-41-a8
    └── grub2
        ├── boot.efi -> ../../../bootloader-universe/grubx64.efi
        ├── boot-sb.efi -> ../../../bootloader-universe/shimx64.efi
        ├── grub.cfg
        ├── grub.cfg-00:50:56:a3:41:a8
        ├── grub.cfg-01-00-50-56-a3-41-a8
        ├── grubx64.efi -> ../../../bootloader-universe/grubx64.efi
        ├── os_info
        └── shimx64.efi -> ../../../bootloader-universe/shimx64.efi

[root@vm ~]# grep -B2 00-50-56-b4-75-5e /var/lib/dhcpd/dhcpd.leases
hardware ethernet 00:50:56:b4:75:5e;
fixed-address 192.168.145.84;
supersede server.filename = "host-config/00-50-56-b4-75-5e/grub2/boot-sb.efi";

[root@vm ~]# pesign -S -i /var/lib/tftpboot/host-config/00-50-56-b4-75-5e/grub2/boot-sb.efi | grep "Microsoft Windows UEFI Driver Publisher"
The signer's common name is Microsoft Windows UEFI Driver Publisher
@goarsna
Copy link
Author

goarsna commented Sep 29, 2024

Hey Nofar, once again thanks for your feedback! Regarding the two questions from your comment above:

  1. We currently lack tests for the new workflow. It would be great to have at least some tests for the symlink generation.

Indeed the PR is lacking tests 🙈 I will think about that and add some.

  1. Since we are no longer sending the build parameter in the POST request when initializing the TFTP service, we follow the same procedure when the host enters build mode and when it exits build mode.
    IMO, there's no need to delete the existing symlinks and create new ones when the action performed is to disable build mode for a host.

I wasn't able to finish my research on that and I am off on PTO until Monday, October 10th. But here are my current thoughts:
Yes, without the build flag the boot files get deployed when a host enters build mode as well as when it leaves build mode. Or better said: They get redeployed every time the TFTP configuration is redeployed. And I agree there is no need to redeploy them when a host leaves build mode. But (and that is were my research ended for now) with relying on the build flag the boot files also where redeployed in case a host left build mode. At least this must have been the case. Otherwise booting over network would not have worked anymore after a host has left build mode.
But after my vacation I will check back on that and evaluate more and head out to you.

@jloeser
Copy link

jloeser commented Sep 30, 2024

  1. Since we are no longer sending the build parameter in the POST request when initializing the TFTP service, we follow the same procedure when the host enters build mode and when it exits build mode.
    IMO, there's no need to delete the existing symlinks and create new ones when the action performed is to disable build mode for a host.

I wasn't able to finish my research on that and I am off on PTO until Monday, October 10th. But here are my current thoughts: Yes, without the build flag the boot files get deployed when a host enters build mode as well as when it leaves build mode. Or better said: They get redeployed every time the TFTP configuration is redeployed. And I agree there is no need to redeploy them when a host leaves build mode. But (and that is were my research ended for now) with relying on the build flag the boot files also where redeployed in case a host left build mode. At least this must have been the case. Otherwise booting over network would not have worked anymore after a host has left build mode. But after my vacation I will check back on that and evaluate more and head out to you.

I am not in the technical details here, but:

the difference between the two build modes (enabled/disabled) basically happens in the GRUB2 configuration file. Thus, symlinks to the correct OS related network boot files of the host is mandatory for installation (=build mode enabled) and booting from disk (=build mode disabled; loading local GRUB2 configuration from local disk).

The only event which should trigger creation of new symlinks is when the OS of the host changes, ideally independent of any build mode switch, but at latest when switching build mode. And to make it easier, this can happen in both switch cases for simplicity.

What we could think about is, if not already considered/implemented, to symlink also all the GRUB2 configuration files to avoid duplicates which might be confusing (e.g. /var/lib/tftpboot/host-config/00-11-22-33-44-55/grub2/grub.cfg-00-11-22-33-44-55 -> ../../../grub2/grub.cfg-00-11-22-33-44-55).

@nofaralfasi
Copy link

the difference between the two build modes (enabled/disabled) basically happens in the GRUB2 configuration file. Thus, symlinks to the correct OS related network boot files of the host is mandatory for installation (=build mode enabled) and booting from disk (=build mode disabled; loading local GRUB2 configuration from local disk).

The only event which should trigger creation of new symlinks is when the OS of the host changes, ideally independent of any build mode switch, but at latest when switching build mode. And to make it easier, this can happen in both switch cases for simplicity.

Thank you for pointing that out. So, what you're suggesting is that new symlinks should be created even when the host is not in build mode, as this might still be necessary? For instance, when a user updates the OS directly on the host and then updates Foreman to reflect the new version, the symlinks would still need to be updated. Please correct me if I’m misunderstanding.

What we could think about is, if not already considered/implemented, to symlink also all the GRUB2 configuration files to avoid duplicates which might be confusing (e.g. /var/lib/tftpboot/host-config/00-11-22-33-44-55/grub2/grub.cfg-00-11-22-33-44-55 -> ../../../grub2/grub.cfg-00-11-22-33-44-55).

I have no objections to doing that, but perhaps we should keep it outside the scope of this PR.

@jloeser
Copy link

jloeser commented Sep 30, 2024

Thank you for pointing that out. So, what you're suggesting is that new symlinks should be created even when the host is not in build mode, as this might still be necessary? For instance, when a user updates the OS directly on the host and then updates Foreman to reflect the new version, the symlinks would still need to be updated. Please correct me if I’m misunderstanding.

Correct. The described example by you is probably a corner case, but such an host wouldn't be able to reboot because build mode wasn't changed at all, thus no symlinks have been updated at all. This means the former OS vendor shim tries to boot the new OS kernel from disk (requires theforeman/foreman#10247).

Question is when the TFTP API is triggered. That's something you guys might now better. Is this only when switching build mode? Or also when a host was updated at all (edit host, click "Submit")?

I just wanted to point out that whenever TFTP API is triggered, (re)writing symlinks should not harm, right? I don't think we need to distinguish if build mode was enabled or if build mode was disabled - just always (re)write symlinks. At the end it's always the same target if there wasn't any OS change for the host.

I have no objections to doing that, but perhaps we should keep it outside the scope of this PR.

Yes, we can and should handle this separately.

@nofaralfasi
Copy link

Correct. The described example by you is probably a corner case, but such an host wouldn't be able to reboot because build mode wasn't changed at all, thus no symlinks have been updated at all. This means the former OS vendor shim tries to boot the new OS kernel from disk (requires theforeman/foreman#10247).

Question is when the TFTP API is triggered. That's something you guys might now better. Is this only when switching build mode? Or also when a host was updated at all (edit host, click "Submit")?

When build mode is changing, and also when updating a host.

I just wanted to point out that whenever TFTP API is triggered, (re)writing symlinks should not harm, right? I don't think we need to distinguish if build mode was enabled or if build mode was disabled - just always (re)write symlinks. At the end it's always the same target if there wasn't any OS change for the host.

I don’t think it should cause any harm. That's exactly what I want to clarify in this discussion. And I want to ensure that we're not performing unnecessary operations simply because they aren't harmful, but rather that there’s a valid reason for them.

@jloeser
Copy link

jloeser commented Oct 1, 2024

I don’t think it should cause any harm. That's exactly what I want to clarify in this discussion. And I want to ensure that we're not performing unnecessary operations simply because they aren't harmful, but rather that there’s a valid reason for them.

Hm, IMHO I could live with only (re)writing symlinks when build mode changes because I think it's not a valid use case to change the OS directly on the host. But as said, I think we do not need to distinguish between transitions (build mode enabled→disabled or disabled → enabled).

And of course if the PXE loader changes (which is part of host update, right), then we would also need to (re)write symlinks.

@nofaralfasi
Copy link

Hm, IMHO I could live with only (re)writing symlinks when build mode changes because I think it's not a valid use case to change the OS directly on the host. But as said, I think we do not need to distinguish between transitions (build mode enabled→disabled or disabled → enabled).

And of course if the PXE loader changes (which is part of host update, right), then we would also need to (re)write symlinks.

Why do we need to rewrite symlinks when the PXE loader changes? The symlinks are created for all hosts, even those that don’t use the Secure Boot PXE loaders, so it seems that changes to the PXE loader shouldn't have an impact.

However, if we choose to rewrite the symlinks regardless of whether the host is in build mode, then this concern may be less significant.

@nofaralfasi
Copy link

I’d like to summarize our discussion so far:
We are considering when to rewrite the symlinks for hosts. Currently, when a (managed) host is created, Foreman generates symlinks under /var/lib/tftp/host-config/MAC/grub2/, even if the host does not utilize Secure Boot PXE loaders.

With this approach, we rewrite a host’s symlinks every time it is updated, regardless of whether the host is in build mode. Our concern is that this adds unnecessary complexity to the workflow. However, there may be scenarios where this operation is needed, such as when changing the OS of a host without rebuilding it.

@jloeser @goarsna @stejskalleos I’d appreciate your thoughts on this.

@goarsna
Copy link
Author

goarsna commented Oct 7, 2024

Thanks, @jloeser and @nofaralfasi for all your input.

I read through all your posts and I agree to what Jan wrote:

The only event which should trigger creation of new symlinks is when the OS of the host changes, ideally independent of any build mode switch, ...

Yes, as the bootloader files depend on the operating system the symlinks should get updated when the operating system is changed and not depending on the build mode flag.

Regarding your example, Nofar:

For instance, when a user updates the OS directly on the host and then updates Foreman to reflect the new version, the symlinks would still need to be updated. Please correct me if I’m misunderstanding.

Jan said that this is a corner case, which may be true for most OSes. But if I remember correctly @lzap (please correct me in case I am wrong) mentioned once in a call that for Red Hat Enterprise Linux the shim has to match the EL minor version. So for RHEL it is necessary to redeploy the boot files on an OS change.
Lets have a look at the following example:

  1. We have a RHEL 8.10 host with Secure Boot enabled.
  2. This host consumes RHEL 8.10 specific boot files from :tftproot:/bootloader-universe/pxegrub2/redhat/8.10/x86_64/.
  3. We put boot files for RHEL 9.4 into :tftproot:/bootloader-universe/pxegrub2/redhat/9.4/x86_64/
  4. The host gets updated to RHEL 9.4, Secure Boot is still enabled.
  5. The OS is changed in Foreman to reflect the OS update on the RHEL host.

=> Without updating the symlinks, the host won't be able to boot anymore as it still consumes the RHEL 8.10 boot files.
=> With updating the symlinks, the host will be able to boot as he now consumes boot files from the 9.4 directory.

Also IMO updating the symlinks every time the TFTP configuration is redeployed should not harm. But I also agree with what Nofar said: We should not perform unnecessary operations just because they do not harm.

So the best would be to only update the symlinks in the cases we require them to get updated and this is when the OS of a host is changed. And a check if the OS has been changed is already implemented in the queue_tftp_update method in Foremans tftp.rb#R169. I didn't take a deep look at the code for now but my first thought was that we could reuse the check in the proxy.set method called in tftp.rb#R90-R96 and extend it to also evaluate to true in case the host gets created (for example check if old.host is nil?) so that the symlinks get initially set when the host gets created and updated every time the OS changes.

Hui... This was a lot of text now... Sorry for that 😅 😄

Edit: Ok, I should add a small summary as my post was really pretty long:

  1. We should not rely on the build flag to determine weather the symlinks should be updated
  2. We could implement to only update them on OS changes.

What do you think, @jloeser and @nofaralfasi?

And finally:

What we could think about is, if not already considered/implemented, to symlink also all the GRUB2 configuration files to avoid duplicates which might be confusing (e.g. /var/lib/tftpboot/host-config/00-11-22-33-44-55/grub2/grub.cfg-00-11-22-33-44-55 -> ../../../grub2/grub.cfg-00-11-22-33-44-55).

I like your suggestion and would be happy to see / do that in a follow-up PR. :)

@goarsna
Copy link
Author

goarsna commented Oct 8, 2024

After posting I kept on thinking about further possible corner cases and a second scenario came to my mind where we (or better said the users) have to update the symlinks (If I recall correctly we once already talked / wrote about this case for short, but I want to bring it up here again to be aware of it):

I am thinking about the case that certificates get revoked and a host suddenly has to use boot files from a OS- and version-specific directory because we want to boot newer hosts from the OS-specific default directory.

Lets have a look at following fictional example:

  1. We have an AlmaLinux 8.10 host and use symlinks to the OS-specific and version-independent boot files like :tftproot:/bootloader-universe/pxegrub2/almalinux/default/x86_64/ to boot that host.
  2. The certificate AlmaLinux used gets revoked so that for hosts running newer AlmaLinux OSes like 9.4 we require different shim and GRUB2 binaries.
  3. We want to put these new boot files to use for AlmaLinux > 8.10 into the default folder at :tftproot:/bootloader-universe/pxegrub2/almalinux/default/x86_64/.
  4. As result we have to set up the boot files for AlmaLinux 8.10 at the version-specific folder at :tftproot:/bootloader-universe/pxegrub2/almalinux/8.10/x86_64/.

=> The AlmaLinux 8.10 host will still boot using the symlinks that point to the :tftproot:/bootloader-universe/pxegrub2/almalinux/default/x86_64/ directory which leads to a not bootable system.

But this case is a bit tricky as we have no change to the host in Foreman to determine that the symlinks have to be updated. The best would be to be able to manually trigger a redeployment of the TFTP configuration to update the symlinks (or is there already a possibility to manually trigger this?!). But IMO if we want to provide an somehow automated way to do this, it should definitely be a follow-up task. For now, users have to manually adjust the symlinks in this case.

@nofaralfasi
Copy link

@goarsna I completely agree with everything you’ve outlined here. Great catch on identifying these potential scenarios.

@goarsna
Copy link
Author

goarsna commented Oct 9, 2024

@nofaralfasi Thanks! :D Then I will try to find time to implement the "Update the symlinks only on OS changes"-change ASAP :) Nevertheless input, thoughts, ack by @jloeser / @stejskalleos is still appreciated :)

@nofaralfasi
Copy link

@nofaralfasi Thanks! :D Then I will try to find time to implement the "Update the symlinks only on OS changes"-change ASAP :) Nevertheless input, thoughts, ack by @jloeser / @stejskalleos is still appreciated :)

After discussing with the team, everyone agreed there's no need to add extra logic for updating the symlinks only when the OS changes. We can proceed with keeping things as they are, with no further changes needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants